The present invention relates in general to improved multiprocessor data processing systems, and in particular to an improved method and system for maintaining memory coherence in a multiprocessor data processing system. Still more particularly, the present invention relates to an improved method and system for maintaining translation lookaside buffer (TLB) coherency in a multiprocessor data processing system without requiring the utilization of interprocessor interrupts.

Designers of modern state-of-the-art data processing systems are continually attempting to enhance the performance aspects of such systems. One technique for enhancing data processing system efficiency is the achievement of short cycle times and a low Cycles-Per-Instruction (CPI) ratio. An excellent example of the application of these techniques to an enhanced data processing system is the International Business Machines Corporation RISC System/6000 (RS/6000) computer. The RS/6000 system is designed to perform well in numerically intensive engineering and scientific applications as well as in multi-user, commercial environments. The RS/6000 processor employs a multiscalar implementation, which means that multiple instructions are issued and executed simultaneously.

The simultaneous issuance and execution of multiple instructions requires independent functional units that can execute concurrently with a high instruction bandwidth. The RS/6000 system achieves this by utilizing separate branch, fixed point and floating point processing units which are pipelined in nature. In such systems a significant pipeline delay penalty may result from the execution of conditional branch instructions. Conditional branch instructions are instructions which dictate the taking of a specified conditional branch within an application in response to a selected outcome of the processing of one or more other instructions. Thus, by the time a conditional branch instruction propagates through a pipeline queue to an execution position within the queue, it will have been necessary to load instructions into the queue behind the conditional branch instruction prior to resolving the conditional branch in order to avoid run-time delays.

Another source of delays within multiscalar processor systems is the fact that such systems typically execute multiple tasks simultaneously. Each of these multiple tasks typically has an effective or virtual address space which is utilized for execution of that task. Locations within such an effective or virtual address space include addresses which "map" to a real address within system memory. It is not uncommon for a single space within real memory to map to multiple effective or virtual memory addresses within a multiscalar processor system. The utilization of effective or virtual addresses by each of the multiple tasks creates additional delays within a multiscalar processor system due to the necessity of translating these addresses into real addresses within system memory, so that the appropriate instruction or data may be retrieved from memory and placed within an instruction queue for dispatching to one of the multiple independent functional units which make up the multiscalar processor system.

One technique whereby effective or virtual memory addresses within a multiscalar processor system may be rapidly translated to real memory addresses within system memory is the utilization of a so-called "translation lookaside buffer" (TLB). A translation lookaside buffer (TLB) is a buffer which contains translation relationships between effective or virtual memory addresses and real memory addresses which have been generated utilizing a translation algorithm. While the utilization of translation lookaside buffer (TLB) devices provides a reasonably efficient method for translating addresses, the utilization of such buffers in tightly coupled symmetric multiprocessor systems causes a problem in coherency. In data processing systems in which multiple processors may read from and write to a common system real memory, care must be taken to ensure that the memory system operates in a coherent manner. That is, the memory system is not permitted to become incoherent as a result of the operations of multiple processors. Each processor within such a multiprocessor data processing system typically includes a translation lookaside buffer (TLB) for address translation, and the shared aspect of memory within such systems requires that changes to a single translation lookaside buffer (TLB) within one processor in a multiprocessor system be carefully and consistently mapped into each translation lookaside buffer (TLB) within each processor within the multiprocessor computer system in order to maintain coherency.
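The coherency problem described above can be illustrated with a minimal software sketch. The `TLB` class, the 4096-byte `PAGE_SIZE`, and the page numbers are assumptions made purely for illustration; they are not taken from the specification and do not model any particular hardware.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class TLB:
    """Tiny software model of a translation lookaside buffer (hypothetical)."""

    def __init__(self):
        self.entries = {}  # effective page number -> real page number

    def insert(self, effective_page, real_page):
        self.entries[effective_page] = real_page

    def translate(self, effective_address):
        """Return the real address, or None on a TLB miss."""
        page, offset = divmod(effective_address, PAGE_SIZE)
        real_page = self.entries.get(page)
        if real_page is None:
            return None
        return real_page * PAGE_SIZE + offset

    def invalidate(self, effective_page):
        self.entries.pop(effective_page, None)


# Two processors cache the same translation.
tlb_a, tlb_b = TLB(), TLB()
for tlb in (tlb_a, tlb_b):
    tlb.insert(5, 42)

# Processor A invalidates its entry (for example after a page is relocated),
# but nothing tells processor B to do the same.
tlb_a.invalidate(5)
stale = tlb_b.translate(5 * PAGE_SIZE + 16)  # B still uses the old mapping
```

Processor B continues to translate through the stale entry, which is the incoherent state that the invalidation must prevent in every translation lookaside buffer within the system.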

The maintenance of translation lookaside buffer (TLB) coherency in prior art multiprocessor systems is typically accomplished utilizing interprocessor interrupts and software synchronization for all translation lookaside buffer (TLB) modifications. These approaches can be utilized to ensure coherency throughout the multiprocessor system; however, the necessity of utilizing interrupts and software synchronization results in a substantial performance degradation within a multiprocessor computer system.

It should therefore be apparent that a need exists for a method and system which may be utilized to maintain translation lookaside buffer coherency in a multiprocessor data processing system without the requirement for utilizing interprocessor interrupts.
each of the plurality of processors executing multiple instructions, a memory management unit [...] and an associated translation lookaside buffer for translating effective addresses into real memory addresses [...], the method comprising the steps of: broadcasting a translation [...] in response to the execution of an associated translation lookaside buffer invalidate instruction [...]

[...] memory and a plurality of processors [...] broadcasting a translation lookaside buffer invalidate instruction [...] the translation [...] processors only in response to an acceptance of [...] of processors.



Translation lookaside buffers (TLB) are often utilized in a data processing system to efficiently translate an effective or virtual address to a real address within system memory. In systems which include multiple processors which may all access system memory, each processor may include a translation lookaside buffer (TLB) for translating effective addresses to real addresses, and coherency between all translation lookaside buffers (TLB) must therefore be maintained. The method and system disclosed herein may be utilized to broadcast a unique bus structure in response to an execution of a translation lookaside buffer invalidate instruction by any processor within a multiprocessor system. The bus structure is accepted by other processors along the bus only in response to an absence of a pending translation lookaside buffer invalidate (TLBI) instruction within each processor. Thus, a broadcast translation lookaside buffer invalidate (TLBI) instruction may only be executed by the other processors within a multiprocessor system if it has been accepted by all processors within the system. After initiating execution of a translation lookaside buffer invalidate (TLBI) instruction at all processors within the system, the execution of pending instructions is temporarily terminated until after the translation lookaside buffer invalidate (TLBI) instruction has been executed. Thereafter, the execution of instructions is suspended until all read and write operations within the memory queue have achieved coherency. Next, all suspended and/or prefetched instructions are refetched utilizing the modified translation lookaside buffer (TLB) to ensure that the most recent translation data is utilized.

The present invention thus provides an improved method and system for maintaining translation lookaside buffer (TLB) coherency in a multiprocessor data processing system without requiring the utilization of interprocessor interrupts.

A preferred embodiment of the present invention will now be described with reference to the accompanying drawings, in which:
Figure 1 is a high level block diagram depicting a multiprocessor data processing system which may be utilized to implement the method and system of the present invention;
Figure 2 is a high level block diagram depicting one multiscalar processor within the multiprocessor data processing system of Figure 1;
Figure 3 is a more detailed block diagram depicting a translation lookaside buffer (TLB) and memory management unit (MMU) within the multiscalar processor of Figure 2;
Figure 4 is a high level logic flowchart illustrating the initiation of a translation lookaside buffer invalidate (TLBI) instruction at one multiscalar processor within the multiprocessor data processing system of Figure 1 in accordance with the method and system of the present invention;
Figure 5 is a high level logic flowchart illustrating the processing of a translation lookaside buffer invalidate (TLBI) instruction throughout the multiprocessor data processing system of Figure 1 in accordance with the method and system of the present invention; and
Figure 6 is a high level logic flowchart illustrating the synchronization of multiple translation lookaside buffer invalidate (TLBI) instructions within the multiprocessor data processing system of Figure 1 in accordance with the method and system of the present invention.
With reference now to the figures and in particular with reference to Figure 1, there is depicted a high level block diagram illustrating a multiprocessor data processing system 6 which may be utilized to implement the method and system of the present invention. As illustrated, multiprocessor data processing system 6 may be constructed utilizing multiscalar processors 10 which are each coupled to system memory 18 utilizing bus 8. In a tightly-coupled symmetric multiprocessor system, such as multiprocessor data processing system 6, each processor 10 within multiprocessor data processing system 6 may be utilized to read from and write to memory 18. Thus, systems and interlocks must be utilized to ensure that the data and instructions within memory 18 remain coherent.

As illustrated within Figure 1, and as will be explained in greater detail herein, each processor 10 within multiprocessor data processing system 6 includes a translation lookaside buffer (TLB) 40 which may be utilized to efficiently translate effective or virtual addresses for instructions or data into real addresses within system memory 18. In view of the fact that a translation lookaside buffer (TLB) constitutes a memory space, it is important to maintain coherency among each translation lookaside buffer (TLB) 40 within multiprocessor data processing system 6 in order to assure accurate operation thereof.
Referring now to Figure 2, there is depicted a high level block diagram of a multiscalar processor 10 which may be utilized to provide multiprocessor data processing system 6 of Figure 1. As illustrated, multiscalar processor 10 preferably includes a memory queue 36 which may be utilized to store data, instructions and the like which is read from or written to system memory 18 (see Figure 1) by multiscalar processor 10. Data or instructions stored within memory queue 36 are preferably accessed utilizing cache/memory interface 20 in a method well known to those having skill in the art. The sizing and utilization of cache memory systems is a well known subspecialty within the data processing art and is not addressed within the present application. However, those skilled in the art will appreciate that by utilizing modern associated cache techniques a large percentage of memory accesses may be achieved utilizing data temporarily stored within cache/memory interface 20.

Instructions from cache/memory interface 20 are typically loaded into instruction queue 22 which preferably includes a plurality of queue positions. In a typical embodiment of a multiscalar computer system the instruction queue may include eight queue positions and thus, in a given cycle, between zero and eight instructions may be loaded into instruction queue 22, depending upon how many valid instructions are passed by cache/memory interface 20 and how much space is available within instruction queue 22.

As is typical in such multiscalar processor systems, instruction queue 22 is utilized to dispatch instructions to multiple execution units. As depicted within Figure 2, multiscalar processor 10 includes a floating point processor unit 24, a fixed point processor unit 26, and a branch processor unit 28. Thus, instruction queue 22 may dispatch between zero and three instructions during a single cycle, one to each execution unit.
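The dispatch limit described above (between zero and three instructions per cycle, at most one per execution unit) may be sketched as follows. The function name `dispatch_cycle`, the unit labels, and the policy of skipping over an instruction whose unit is already taken are assumptions for illustration; the specification does not state how the queue selects instructions.

```python
from collections import deque

UNITS = ("floating", "fixed", "branch")  # units 24, 26 and 28 of Figure 2


def dispatch_cycle(queue):
    """Dispatch at most one queued instruction to each execution unit.

    `queue` holds (unit, mnemonic) pairs. Instructions that cannot be
    dispatched this cycle stay in the queue in their original order.
    """
    dispatched = {}
    remaining = deque()
    while queue:
        unit, mnemonic = queue.popleft()
        if unit in UNITS and unit not in dispatched:
            dispatched[unit] = mnemonic
        else:
            remaining.append((unit, mnemonic))
    queue.extend(remaining)
    return dispatched


queue = deque([
    ("fixed", "add"),
    ("fixed", "sub"),
    ("branch", "bc"),
    ("floating", "fmul"),
])
issued = dispatch_cycle(queue)
```

In this example three instructions are dispatched in one cycle and the second fixed point instruction waits for the next cycle.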

In addition to sequential instructions dispatched from instruction queue 22, so-called "conditional branch instructions" may be loaded into instruction queue 22 for execution by the branch processor. A conditional branch instruction is an instruction which specifies an associated conditional branch to be taken within the application in response to a selected outcome of processing one or more sequential instructions. In an effort to minimize run time delay in a pipelined processor system, such as multiscalar processor 10, the presence of a conditional branch instruction within the instruction queue is detected and an outcome of the conditional branch is predicted. As should be apparent to those having skill in the art, when a conditional branch is predicted as "not taken" the sequential instructions within the instruction queue simply continue along a current path and no instructions are altered. However, if the prediction as to the occurrence of the branch is incorrect, the instruction queue must be purged of sequential instructions which follow the conditional branch instruction in program order, and target instructions must be fetched. Alternately, if the conditional branch is predicted as "taken" then the target instructions are fetched and utilized to follow the conditional branch, if the prediction is resolved as correct. And of course, if the prediction of "taken" is incorrect, the target instructions must be purged and the sequential instructions which follow the conditional branch instruction in program order must be retrieved.

As illustrated, multiscalar processor 10 also preferably includes a condition register 32. Condition register 32 is utilized to temporarily store the results of various comparisons which may occur utilizing the outcome of sequential instructions which are processed within multiscalar processor 10. Thus, floating point processor unit 24, fixed point processor unit 26 and branch processor unit 28 are all coupled to condition register 32. The status of a particular condition within condition register 32 may be detected and coupled to branch processor unit 28 in order to generate target addresses which are then utilized to fetch target instructions in response to the occurrence of a condition which initiates a branch.
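The four prediction outcomes described above (predicted "taken" or "not taken", each resolved as correct or incorrect) can be summarized in a short sketch. The function `resolve_branch` and its return convention are hypothetical and serve only to make the purge rule explicit.

```python
def resolve_branch(predicted_taken, actually_taken, sequential, target):
    """Return (instructions to execute after the branch, purge_needed).

    `sequential` are the instructions following the branch in program
    order; `target` are the instructions at the branch target.
    """
    if predicted_taken == actually_taken:
        # Correct prediction: the speculatively loaded path is kept.
        kept = target if predicted_taken else sequential
        return kept, False
    # Incorrect prediction: purge the speculative path, fetch the other one.
    correct = target if actually_taken else sequential
    return correct, True


seq, tgt = ["s1", "s2"], ["t1", "t2"]
outcomes = {
    (p, a): resolve_branch(p, a, seq, tgt)
    for p in (False, True)
    for a in (False, True)
}
```

A purge is needed in exactly the two cases where the prediction and the resolved outcome disagree.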

Thereafter, branch processor unit 28 couples target addresses to fetcher 30, which determines fetch addresses for the target instructions and couples those fetch addresses to cache/memory interface 20. As will be appreciated by those having skill in the art, if the target instructions associated with those fetch addresses are present within cache/memory interface 20, those target instructions are loaded into instruction queue 22. Alternately, the target instructions may be fetched from memory 18 and thereafter loaded into instruction queue 22 from cache/memory interface 20 after a delay required to fetch those target instructions.

As those skilled in the art will appreciate, each task within multiscalar processor 10 will typically have associated therewith an effective or virtual memory space, and instructions necessary to implement each task will be set forth within that space utilizing effective or virtual addresses. Thus, fetcher 30 must be able to determine the real address for instructions from the effective addresses utilized by each task. As described above, prior art implementations of fetcher 30 typically either incorporate a complex translation lookaside buffer (TLB), sequence register and multiple translation algorithms or, alternately, such instruction fetchers are required to access a memory management unit having such complex translation capability in order to determine real instruction addresses.

[...] a memory management unit (MMU) [...] registers and translation algorithms which may be utilized to translate each effective address within multiscalar processor 10 into a real address within system memory 18. Fetcher units typically have a very low priority for accessing a memory management unit (MMU) and therefore some delay is expected in the obtaining of a real instruction address utilizing a memory management unit (MMU).

With reference now to Figure 3, there is depicted a more detailed block diagram illustrating a translation lookaside buffer (TLB) and memory management unit (MMU) within the multiscalar processor of Figure 2. As illustrated within Figure 3, the relationship between cache/memory interface 20, fetcher 30 and memory management unit (MMU) 34 is depicted. As is typical in known memory management units, memory management unit (MMU) 34 includes a substantially sized translation lookaside buffer (TLB) 40. Those skilled in the art will appreciate that a translation lookaside buffer (TLB) is often utilized as a fairly rapid technique for translating from effective or virtual address to real address. Also present within memory management unit (MMU) 34 is PTE translator 42 and BAT translator 44. PTE translator 42 is preferably utilized to implement page table type translation and BAT translator 44 is utilized to translate address block type translations. Those skilled in the art will appreciate that these two translation algorithms are substantially different, in that a page table translation occurs within a system having consistently sized memory pages while an address block translation may result in a defined address block having, for example, a size ranging from a twenty-eight kilobyte block to eight megabytes of memory.

Thus, upon reference to Figure 3, those skilled in the art will appreciate that by utilizing translation lookaside buffer (TLB) 40 in conjunction with PTE translator 42, all effective addresses within multiscalar processor 10 (see Figure 2) which utilize the page table translation may be translated into a real address [...]. In the manner depicted, every effective or virtual address within multiscalar processor 10 may be translated into a real address within system memory by utilizing memory management unit (MMU) 34.

As those skilled in the art will appreciate, fetcher 30 is utilized to couple fetch addresses to cache/memory interface 20 for target instructions which are selected by branch unit 28. For each target address coupled to fetcher 30 from branch unit 28 a fetch address is determined and coupled to cache/memory interface 20. In the depicted embodiment of the present invention, these addresses may often be determined by accessing translation lookaside buffer (TLB) 40 within memory management unit 34. Thus, it should be apparent that in order to maintain coherence within each multiscalar processor 10 within multiprocessor data processing system 6 it will be necessary to maintain coherence between each translation lookaside buffer (TLB) 40 within each multiscalar processor 10.
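The division of labor between translation lookaside buffer (TLB) 40, PTE translator 42 and BAT translator 44 may be sketched as shown below. This is a simplified model under stated assumptions: the lookup order (TLB first, then address blocks, then the page table), the 4096-byte page size, the block sizes, and all names are hypothetical and are not specified in the text.

```python
PAGE_SIZE = 4096  # assumed page size


def translate(effective_address, tlb, bat_blocks, page_table):
    """Translate an effective address into a real address.

    tlb:        dict of effective page -> real page (cached translations)
    bat_blocks: list of (effective_base, size, real_base) address blocks
    page_table: dict of effective page -> real page (full page table)
    """
    page, offset = divmod(effective_address, PAGE_SIZE)

    # Fast path: the translation is already held in the TLB.
    if page in tlb:
        return tlb[page] * PAGE_SIZE + offset

    # Address block type translation: one entry covers a variable-sized block.
    for effective_base, size, real_base in bat_blocks:
        if effective_base <= effective_address < effective_base + size:
            return real_base + (effective_address - effective_base)

    # Page table type translation: fixed-size pages, result cached in the TLB.
    if page in page_table:
        tlb[page] = page_table[page]
        return tlb[page] * PAGE_SIZE + offset

    raise LookupError(f"no translation for {effective_address:#x}")


tlb = {}
bat_blocks = [(0x100000, 0x20000, 0x800000)]  # illustrative block
page_table = {3: 9}

via_page_table = translate(3 * PAGE_SIZE + 8, tlb, bat_blocks, page_table)
via_block = translate(0x100010, tlb, bat_blocks, page_table)
```

After the first call the page table result is cached in `tlb`, so it is exactly this cached state that must later be invalidated consistently in every processor.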
Referring now to Figure 4, there is depicted a high level logic flowchart which illustrates the initiation of a translation lookaside buffer invalidate (TLBI) instruction by one multiscalar processor within multiprocessor data processing system 6 of Figure 1. Those skilled in the art will appreciate that a translation lookaside buffer invalidate (TLBI) instruction is issued within the data processing system in order to invalidate an entry within a translation lookaside buffer (TLB) which might otherwise be utilized to translate effective or virtual addresses into real addresses within system memory. Such situations will, of course, occur as a result of the relocation of data or instructions within system memory or as a result of any other operation which modifies the translation relationship between an effective or virtual address and its real address within system memory.

As depicted within Figure 4, the process begins at block 50 and thereafter passes to block 52. Block 52 illustrates a determination of whether or not a translation lookaside buffer invalidate (TLBI) instruction is within the "EXECUTE" position within a fixed point processor in a multiscalar processor within multiprocessor data processing system 6 of Figure 1. If this situation does not occur, the process merely iterates until such time as a translation lookaside buffer invalidate (TLBI) instruction is detected within the "EXECUTE" position in a fixed point processor unit within the system. After detecting a translation lookaside buffer invalidate (TLBI) instruction the process passes to block 54. Block 54 illustrates the performance of the translation lookaside buffer invalidate (TLBI) instruction locally, upon the translation lookaside buffer (TLB) within the local multiscalar processor. Thereafter, the process passes to block 56.

Block 56 illustrates the arbitration for bus access by the local multiscalar processor and thereafter the process passes to block 58. Block 58 illustrates a determination of whether or not access has been granted to bus 8 (see Figure 1). If not, the process returns iteratively to block 56 to again attempt to arbitrate for bus access. After gaining access to bus 8, as determined at block 58, the process passes to block 60. Block 60 illustrates the broadcasting on bus 8 of a translation lookaside buffer invalidate (TLBI) bus structure which is associated with the translation lookaside buffer invalidate (TLBI) instruction which has just been executed. Upon reference to the foregoing those skilled in the art will appreciate that an existing memory bus structure may be utilized with an expanded set of transaction codes and that the translation lookaside buffer invalidate (TLBI) instruction may be either an "index" based invalidate, or may comprise the broadcasting of a full virtual address of the page which is being invalidated by the translation lookaside buffer invalidate (TLBI) instruction.

Next, the process passes to block 62. Block 62 illustrates the determination of whether or not a "RETRY" message has been detected, indicating that one of the multiscalar processors within multiprocessor data processing system 6 has not accepted the broadcast translation lookaside buffer invalidate (TLBI) bus structure. If this occurs, the process returns to block 56 in an iterative fashion to once again attempt to broadcast the translation lookaside buffer invalidate (TLBI) bus structure in the manner described above. However, in the event a "RETRY" message is not detected, indicating that each multiscalar processor within multiprocessor data processing system 6 has accepted the broadcast translation lookaside buffer invalidate (TLBI) bus structure, the process then passes to block 64. Block 64 once again illustrates the arbitration by the local multiscalar processor for bus access and the process then passes to block 66. Block 66 illustrates a determination of whether or not access to the bus has been gained. If no access has been gained, the process returns iteratively to block 64 until such time as bus access has been gained.
Referring now to block 68, after gaining access to the bus, block 68 illustrates the broadcasting of a "SYNCHRO" signal by the initially executing processor within multiprocessor data processing system 6. This signal is utilized to determine whether or not each multiscalar processor within the multiprocessor data processing system has executed the translation lookaside buffer invalidate (TLBI) instruction.

Referring now to block 70, in the event a "RETRY" message is detected, indicating that one or more processors within multiprocessor data processing system 6 have failed to complete the translation lookaside buffer invalidate (TLBI) instruction, the process returns iteratively to block 64 to once again attempt to obtain confirmation that all multiscalar processors within multiprocessor data processing system 6 have executed the translation lookaside buffer invalidate (TLBI) instruction. After receiving an indication that each processor has executed the instruction, the process passes to block 72 and returns.
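The flow of Figure 4 as described above can be sketched from the point of view of the initiating processor. This is a minimal model under stated assumptions: bus arbitration is folded into two hypothetical callables (`broadcast_tlbi` and `broadcast_synchro`) that return the string "RETRY" or `None`, and the `max_attempts` bound is added only so that the sketch always terminates; the flowchart itself simply iterates.

```python
def initiate_tlbi(page, local_tlb, broadcast_tlbi, broadcast_synchro,
                  max_attempts=100):
    """Sketch of the initiating processor's flow (blocks 50 through 72)."""
    # Block 54: perform the invalidate on the local TLB first.
    local_tlb.pop(page, None)

    # Blocks 56-62: rebroadcast the TLBI bus structure while RETRY is seen.
    for _ in range(max_attempts):
        if broadcast_tlbi(page) != "RETRY":
            break
    else:
        raise RuntimeError("TLBI bus structure was never accepted")

    # Blocks 64-70: rebroadcast SYNCHRO until every processor has finished.
    for _ in range(max_attempts):
        if broadcast_synchro() != "RETRY":
            return  # block 72
    raise RuntimeError("SYNCHRO never completed")


# Scripted bus replies stand in for the other processors.
tlbi_replies = iter(["RETRY", None])
synchro_replies = iter(["RETRY", "RETRY", None])
log = []


def fake_broadcast_tlbi(page):
    reply = next(tlbi_replies)
    log.append(("TLBI", page, reply))
    return reply


def fake_broadcast_synchro():
    reply = next(synchro_replies)
    log.append(("SYNCHRO", reply))
    return reply


local_tlb = {7: 70, 8: 80}
initiate_tlbi(7, local_tlb, fake_broadcast_tlbi, fake_broadcast_synchro)
```

The log shows two distinct retry loops: one until the bus structure is accepted everywhere, and a second until every processor reports that the invalidate has completed.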

With reference now to Figure 5, there is depicted a high level logic flowchart illustrating the processing of a translation lookaside buffer invalidate (TLBI) instruction throughout the multiprocessor data processing system of Figure 1. As illustrated, this process begins at block 100 and thereafter passes to block 102. Block 102 [...] the process merely iterates until such event occurs. [...] the process passes to block 104. Block 104 illustrates a determination of whether or not the flag "TLBI PENDING" has been set, indicating that a previous translation lookaside buffer invalidate (TLBI) instruction [...]. If so, the process passes to block 106. Block 106 illustrates the assertion of the "RETRY" message, indicating that the current multiscalar processor [...].

Referring again to block 104, in the event the flag "TLBI PENDING" is not set, the process [...] translation lookaside buffer invalidate (TLBI) [...].

Referring now to block 110, the process illustrated therein depicts the terminating of the dispatch of [...] the process merely iterates until such time as this [...]. Referring now to block 118, after determining [...] the "EXECUTE" position within a fixed point processor unit, the process passes to block 120. Block 120 illustrates the insertion of the associated [...] translation lookaside buffer invalidate (TLBI) instruction.

Next, in accordance with an important feature of the present invention, the process passes to block 124. Block 124 illustrates a determination of whether or not all operations within memory queue 36 have achieved coherency. That is, each multiscalar processor within multiprocessor data processing system 6 is aware of the read and write operations which are pending within memory queue 36. In the event all operations within memory queue 36 (see Figure 2) have not achieved coherency, the process merely iterates until such time as this condition occurs. Thereafter, after all read and write operations within memory queue 36 have achieved coherency, the process passes to [...] been invalidated by the execution of the translation lookaside buffer invalidate (TLBI) instruction [...] the modification to the translation lookaside buffer (TLB).

Next, the process passes to block 128. Block 128 illustrates the branching by this multiscalar processor [...] instruction address [...] utilizing the modified translation lookaside buffer (TLB). As described above, this step is necessary to ensure that the instructions placed within the execution position in the processor have been retrieved utilizing the most recent data within the translation lookaside buffer (TLB). Thereafter, the flag "TLBI PENDING" is cleared and normal dispatch and execution of instructions is resumed. Thereafter, the process passes to block 130 and returns.
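The behavior of a processor that receives the broadcast bus structure may be sketched as a small state machine. The sketch follows the sequence given in the text (refuse with "RETRY" while "TLBI PENDING" is set; otherwise stop dispatch, execute the invalidate, wait for the memory queue to become coherent, discard prefetched instructions so that they are refetched, then clear the flag). The class `Processor`, its attribute names, and the `memory_queue_coherent` flag are hypothetical simplifications.

```python
class Processor:
    """Hypothetical model of a processor snooping TLBI bus structures."""

    def __init__(self, tlb):
        self.tlb = dict(tlb)
        self.tlbi_pending = False          # the "TLBI PENDING" flag
        self.pending_page = None
        self.dispatching = True
        self.memory_queue_coherent = True  # stand-in for memory queue state
        self.prefetched = []

    def snoop_tlbi(self, page):
        """Accept the bus structure, or answer "RETRY" if one is pending."""
        if self.tlbi_pending:
            return "RETRY"
        self.tlbi_pending = True
        self.pending_page = page
        self.dispatching = False           # terminate dispatch of instructions
        return None

    def step(self):
        """Advance a pending TLBI; return True once it has completed."""
        if not self.tlbi_pending:
            return True
        self.tlb.pop(self.pending_page, None)   # execute the invalidate
        if not self.memory_queue_coherent:
            return False                   # iterate until coherency is achieved
        self.prefetched.clear()            # force refetch with the modified TLB
        self.tlbi_pending = False
        self.dispatching = True            # resume normal dispatch
        return True


cpu = Processor({7: 70})
cpu.prefetched = ["old_insn"]
first = cpu.snoop_tlbi(7)      # accepted
second = cpu.snoop_tlbi(8)     # refused, a TLBI is already pending
cpu.memory_queue_coherent = False
stalled = cpu.step()           # invalidate done, still waiting on the queue
cpu.memory_queue_coherent = True
finished = cpu.step()
```

Because the flag is cleared only at the very end, a second broadcast arriving in the meantime is refused rather than queued, which is what lets the scheme avoid interprocessor interrupts.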

Finally, with reference to Figure 6, there is depicted a high level logic flowchart illustrating the synchronization of multiple translation lookaside buffer invalidate (TLBI) instructions within the multiprocessor data processing system of Figure 1, in accordance with the method and system of the present invention. As illustrated, the process begins at block 80 and thereafter passes to [...] "SYNCHRO" signal by a [...] multiprocessor data processing system [...]. In the event this signal is not detected, the process merely iterates until such time as a "SYNCHRO" signal is detected. After detecting a "SYNCHRO" signal, the process passes to block 84. Block 84 illustrates a determination of whether or not a translation lookaside buffer invalidate (TLBI) instruction is pending within the processor, by checking the state of the "TLBI PENDING" flag within that processor. In the event the "TLBI PENDING" flag is not set, the process [...] returns. Alternately, if the "TLBI PENDING" flag is set, [...].

[...] processing translation lookaside buffer invalidate (TLBI) instructions [...] a multiprocessor system [...] not require [...] synchronization and which achieves [...] translation lookaside buffer (TLB) [...] by broadcasting a bus structure associated with each translation lookaside buffer invalidate (TLBI) instruction [...] and must be accepted by all multiscalar processors, and by assuring that operations [...] queue and [...] multiscalar [...] branch to [...] at the end of the execution of the translation lookaside buffer invalidate (TLBI) instruction [...] subsequent [...].
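The "SYNCHRO" handshake can be sketched as repeated rounds in which any processor that still has "TLBI PENDING" set answers "RETRY". The class `Snooper`, the `cycles_to_finish` counter, and the rule that one unit of progress is made per round are assumptions used only to make the example terminate; they are not taken from the specification.

```python
class Snooper:
    """Hypothetical processor that needs some rounds to finish its TLBI."""

    def __init__(self, cycles_to_finish):
        self.cycles_left = cycles_to_finish

    @property
    def tlbi_pending(self):
        return self.cycles_left > 0

    def on_synchro(self):
        """Answer "RETRY" while the TLBI PENDING condition still holds."""
        if self.tlbi_pending:
            self.cycles_left -= 1  # assumed progress between rounds
            return "RETRY"
        return None


def synchro_round(snoopers):
    """One SYNCHRO broadcast: RETRY if any processor is still pending."""
    replies = [s.on_synchro() for s in snoopers]
    return "RETRY" if "RETRY" in replies else None


snoopers = [Snooper(0), Snooper(2)]
rounds = 0
while True:
    rounds += 1
    if synchro_round(snoopers) != "RETRY":
        break
```

With one processor already finished and another needing two more rounds, the initiator sees "RETRY" twice and completes on the third round.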

Claims 

1. A method for maintaining translation lookaside buffer coherency [...] each of the plurality of processors executing multiple instructions, a memory management unit for performing read and write operations within the system memory and an associated translation lookaside buffer for translating effective addresses into real memory addresses within the system memory, the method comprising:

broadcasting a translation lookaside buffer invalidate bus structure along the bus in response to an execution of an associated translation lookaside buffer invalidate instruction within a selected one of the plurality of processors;

accepting the translation lookaside buffer invalidate bus structure at any remaining one of the plurality of processors only in response to an absence of a pending execution of a translation lookaside buffer invalidate instruction therein; and

executing the associated translation lookaside buffer invalidate instruction at all remaining processors [...] structure at all remaining processors among the plurality of processors.

2. [...] instruction.

3. [...] of the associated translation lookaside buffer invalidate instruction.

4. [...] buffer invalidate instruction [...] instructions therein.

5. [...] system memory.
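6. A system for maintaining translation lookaside buffer coherency [...] each of the plurality of processors executing multiple instructions, a memory management unit for performing read and write operations [...] and an associated translation lookaside buffer for translating effective addresses into real memory addresses within the system memory, [...] translation lookaside buffer invalidate [...] processors [...] in response to [...] translation lookaside buffer invalidate [...] the associated translation lookaside buffer invalidate [...] among the plurality of processors.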

7. A system as claimed in Claim 6, including [...] associated translation lookaside buffer invalidate instruction.

8. [...] buffer invalidate instruction.

9. A system as claimed in Claim 8, including [...] translation lookaside buffer invalidate [...] execution of instructions therein.

10. [...] operations within the system memory.